Abstract

We consider a random graph process in which, at each time step, a
new vertex is added with $m$ out-neighbours, chosen with probabilities
proportional to their degree plus a strictly positive constant. We show
that the expectation of the clustering coefficient of the graph process
is asymptotically proportional to $\log n / n$. Bollobás and Riordan [3] have
previously shown that when the constant is zero, the same expectation is
asymptotically proportional to $(\log n)^2 / n$.

1 Introduction

Recently there has been a great deal of interest in the structure of real world
networks, especially the internet. Many mathematical models have been proposed:
most of these describe graph processes in which new edges are added
by some form of preferential attachment. There is a vast literature discussing
empirical properties of these networks but there is also a growing body of more
rigorous work. A wide-ranging account of empirical properties of networks can
be found in [5]; a good survey of rigorous results can be found in [?] or in the
recent book [?].

In [?], Watts and Strogatz defined 'small-world' networks to be those having
small path length and being highly clustered, and discovered that many real 
world networks are small-world networks, e.g. the power grid of the western 
USA and the collaboration graph of film actors. 

There are conflicting definitions of the clustering coefficient appearing in the 
literature. See [3] for a discussion of the relationships between them. We define 
the clustering coefficient $C(G)$ of a graph $G$ as follows:

\[ C(G) = \frac{3 \times \text{number of triangles in } G}{\sum_{v \in V(G)} \binom{d(v)}{2}}, \]

where $d(v)$ is the degree of vertex $v$.

The reason for the three in the numerator is to ensure that the clustering 
coefficient of a complete graph is one. This is the maximum possible value for a 
simple graph. However our graphs will not be restricted to simple graphs and so 
the clustering coefficient can exceed one. For instance, if we take three vertices
and join each pair by $m$ edges then there are $m^3$ triangles, each vertex has
degree $2m$, and the clustering coefficient is $m^2/(2m-1)$.
Note that the clustering coefficient of a graph with at most m edges joining any 
pair of vertices is at most m. 
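
The definition can be checked numerically on small multigraphs. The following is a minimal sketch, not part of the original argument: the function name `clustering_coefficient` and the edge-list representation are assumptions made for illustration, and loops are deliberately not handled.

```python
from collections import Counter
from itertools import combinations
from math import comb


def clustering_coefficient(edges):
    """Return 3 * (#triangles) / sum_v C(d(v), 2) for a loopless multigraph.

    `edges` is a list of (u, v) pairs with u != v; a repeated pair is a
    parallel edge. (Hypothetical helper, for illustration only.)
    """
    mult = Counter(frozenset(e) for e in edges)  # multiplicity of each vertex pair
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    triangles = 0
    for a, b, c in combinations(sorted(degree), 3):
        # choosing one parallel edge on each side gives a distinct triangle
        triangles += (mult[frozenset((a, b))]
                      * mult[frozenset((b, c))]
                      * mult[frozenset((a, c))])
    pairs = sum(comb(d, 2) for d in degree.values())
    return 3 * triangles / pairs
```

For three vertices with each pair joined by $m = 3$ edges, the function returns $9/5 = m^2/(2m-1)$, and for a complete simple graph it returns one.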

In this paper we establish rigorous results describing the asymptotic be- 
haviour of the clustering coefficient for one class of model. Our graph theoretic 
notation is standard. Since our graphs are growing, we let $d_t(v)$ denote the total
degree of vertex $v$ at time $t$. Sometimes we omit $t$ when the context is clear.

The Barabási-Albert model (BA model) [1] is perhaps the most widely studied
graph process governed by preferential attachment. A new vertex is added
to the graph at each time-step and is joined to $m$ existing vertices of the graph
chosen with probabilities proportional to their degrees. A key observation [1] is
that in many large real-world networks, the proportion of vertices with degree
$d$ obeys a power law.

In [4], Bollobás et al. gave a mathematically precise description of the BA
model and showed rigorously that for $d \le n^{1/15}$, the proportion of vertices with
degree $d$ asymptotically almost surely obeys a power law.

A natural generalisation of the BA model is to take the probability of
attachment to $v$ at time $t + 1$ to be proportional to $d_t(v) + a$, where $a$ is a constant
representing the inherent attractiveness of a vertex. Buckley and Osthus [?]
generalised the results in [4] to the case where the attractiveness is a positive
integer. A much more general model was introduced in [?] and further results
extending [4] were obtained. Many more results on these variations of the basic
preferential model can be found in [3].

Bollobás and Riordan showed [5] that the expectation of the clustering
coefficient of the model from [3] is asymptotically proportional to $(\log n)^2/n$. Bollobás
and Riordan also considered in [5] a slight variant of the model from [?]. Their
results imply that for this model the expectation of the clustering coefficient
is also asymptotically proportional to $(\log n)^2/n$. We work with a model
depending on two parameters $\beta$, $m$, which to the best of our knowledge was first
studied rigorously by Móri in [10]. In a sense that we make precise in the next
section, Bollobás and Riordan's model is almost the special case of Móri's model
corresponding to $\beta = 0$.

Our main result is to show that for $\beta > 0$, asymptotically the expectation
of the clustering coefficient is proportional to $\log n / n$. The main strategy of
our proof follows [3] and we use very similar notation. In Section 2 we give
a definition of the model that we use and explain its relationship with the
model studied in [3]. Section 3 contains results that give the probability of the
appearance of a small subgraph. We obtain the expectation of the number of
triangles appearing and of $\sum_{v} \binom{d(v)}{2}$ in Section 4. These two sections follow [3]
quite closely. The overall aim is to express the expectation of the clustering
coefficient as the quotient of the expectation of the number of triangles and the
expectation of $\sum_{v} \binom{d(v)}{2}$. We justify doing this in Section 6, making use of
a concentration result proved in Section 5 using martingale methods. Bollobás
and Riordan [3] used a similar strategy and mentioned that they also used
martingale methods.



2 The model of Móri

We now describe in detail Móri's generalisation of the BA model [11]. Our
definition involves a finer probability space than was described in [11] but the
underlying graph process $(G^n_{m,\beta})$ is identical. The process depends on two
parameters: $m$, the out-degree of each vertex except the first, and $\beta \in \mathbb{R}$ such that
$\beta > 0$. (In [11], Móri imposed the weaker condition that $\beta > -1$.)

We first define the process when $m = 1$. Let $G^1_{1,\beta}$ consist of a single vertex
$v_1$ with no edges. The graph $G^{n+1}_{1,\beta}$ is formed from $G^n_{1,\beta}$ by adding a new vertex
$v_{n+1}$ together with a single directed edge $e$. The tail of $e$ is $v_{n+1}$ and the head
is determined by a random variable $f_{n+1}$. We diverge slightly from [11] in our
description of $f_{n+1}$.

Label the edges of $G^n_{1,\beta}$ with $e_2, \ldots, e_n$ so that $e_i$ is the unique edge whose
tail is $v_i$. Now let

\[ \Omega_{n+1} = \{(1,u), \ldots, (n,u), (2,h), \ldots, (n,h), (2,t), \ldots, (n,t)\}. \]

We define $f_{n+1}$ to take values in $\Omega_{n+1}$ so that for $1 \le i \le n$,

\[ \Pr(f_{n+1} = (i,u)) = \frac{\beta}{(2+\beta)n - 2} \]

and for $2 \le i \le n$,

\[ \Pr(f_{n+1} = (i,h)) = \Pr(f_{n+1} = (i,t)) = \frac{1}{(2+\beta)n - 2}. \]



The head of the new edge added to the graph at time $n + 1$ is called the target
vertex of $v_{n+1}$ and is determined as follows. If $f_{n+1} = (i,u)$ then the target
vertex is $v_i$ and we say that the choice of target vertex has been made uniformly.
If $f_{n+1} = (i,h)$ then the target vertex is the head of $e_i$ and if $f_{n+1} = (i,t)$ then
the target vertex is the tail of $e_i$, that is $v_i$. When one of the last two cases
occurs, we say that the choice of target vertex has been made preferentially by
copying the head or tail, as appropriate, of $e_i$. Suppose we think of an edge as
being composed of two half-edges so that each half-edge retains one endpoint
of the original edge. Then the target vertex is chosen either, with probability
$\beta n/((2+\beta)n - 2)$, by choosing one of the $n$ vertices of $G^n_{1,\beta}$ uniformly at random
or, otherwise, by choosing one of the $2n - 2$ half-edges of $G^n_{1,\beta}$ uniformly at
random and selecting the vertex to which the half-edge is attached.

The definition implies that for $1 \le i \le n$, the probability that the target
vertex of $v_{n+1}$ is $v_i$ is equal to

\[ \frac{d_n(v_i) + \beta}{(2+\beta)n - 2}. \tag{2.1} \]
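
As a sanity check, the fine-grained distribution of $f_{n+1}$ can be aggregated by target vertex and compared with (2.1). The sketch below is illustrative only: it assumes the probabilities $\beta/((2+\beta)n-2)$ for a uniform choice and $1/((2+\beta)n-2)$ for each half-edge, and the names `fine_distribution`, `target_distribution`, `formula_2_1` and the dictionary `heads` are hypothetical.

```python
from fractions import Fraction


def fine_distribution(n, beta):
    """Distribution of f_{n+1} over the labels (i,'u'), (i,'h'), (i,'t')."""
    beta = Fraction(beta)
    denom = (2 + beta) * n - 2
    dist = {(i, "u"): beta / denom for i in range(1, n + 1)}
    for i in range(2, n + 1):
        dist[(i, "h")] = 1 / denom
        dist[(i, "t")] = 1 / denom
    return dist


def target_distribution(heads, beta):
    """heads[i] is the index of the head of e_i, for i = 2..n."""
    n = len(heads) + 1
    target = {j: Fraction(0) for j in range(1, n + 1)}
    for (i, kind), p in fine_distribution(n, beta).items():
        j = heads[i] if kind == "h" else i  # 'u' and 't' both point at v_i
        target[j] += p
    return target


def formula_2_1(heads, beta):
    """(d_n(v_j) + beta) / ((2 + beta) n - 2) for every vertex v_j."""
    beta = Fraction(beta)
    n = len(heads) + 1
    degree = {j: (1 if j >= 2 else 0) for j in range(1, n + 1)}  # own out-edge
    for head in heads.values():
        degree[head] += 1
    return {j: (degree[j] + beta) / ((2 + beta) * n - 2) for j in degree}
```

Exact rational arithmetic is used so that the two distributions can be compared with `==` rather than up to rounding.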



We might have defined $f_{n+1}$ to be a random variable denoting the index of the
target vertex of $v_{n+1}$ and taking probabilities as given in (2.1). Indeed for much
of the sequel we will abuse notation and assume that we did define $f_{n+1}$ in
this way. However it is useful to have the finer definition when we prove the
concentration results in Section 5.

We extend this model to a random graph process $(G^n_{m,\beta})$ for $m > 1$ as follows:
run the graph process $(G^t_{1,\beta})$ and form $G^n_{m,\beta}$ by taking $G^{nm}_{1,\beta}$ and merging the
first $m$ vertices to form $v_1$, the next $m$ vertices to form $v_2$ and so on.
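
The two-stage construction (grow a tree on $nm$ vertices, then merge consecutive groups of $m$ vertices) can be sketched as a simulation. This is a rough illustration under the half-edge description of the model, not the authors' code; the function names `mori_tree_heads` and `mori_graph` and the use of Python's `random` module are assumptions.

```python
import random


def mori_tree_heads(n, beta, rng):
    """Grow the m = 1 tree; heads[i] is the target of v_i (1-indexed, i >= 2)."""
    heads = [None, None]  # indices 0 and 1 are unused
    half_edges = []       # one entry per half-edge, holding its endpoint
    for new in range(2, n + 1):
        k = new - 1                      # number of existing vertices
        total = (2 + beta) * k - 2       # normalising constant
        if k == 1 or rng.random() * total < beta * k:
            target = rng.randint(1, k)          # uniform choice of a vertex
        else:
            target = rng.choice(half_edges)     # preferential choice via a half-edge
        heads.append(target)
        half_edges.extend((new, target))
    return heads


def mori_graph(n, m, beta, seed=None):
    """Return the directed edge list (tail, head) of the merged graph on n vertices."""
    rng = random.Random(seed)
    heads = mori_tree_heads(n * m, beta, rng)

    def block(i):
        return (i - 1) // m + 1          # tree vertex i is merged into this vertex

    return [(block(i), block(heads[i])) for i in range(2, n * m + 1)]
```

Keeping a flat list of half-edge endpoints makes the preferential step a single uniform draw, which mirrors the half-edge description in the text. Loops and parallel edges are kept, since the merged graph is not required to be simple.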

Notice that our definition will not immediately extend to the case $\beta = 0$
because when $n = 1$, the denominator of the expression in (2.1) is zero and
so the process cannot start. One way to get around this problem is to define
$G^2_{1,0}$ to be the graph with two vertices joined by a single edge and then let the
process carry on from there. A second possibility, used in [3], is to attach an
artificial half-edge to $v_1$ at the beginning. This half-edge remains present all
through the process so that the sum of the vertex degrees at time $n$ is $2n - 1$
rather than $2n - 2$ as in the model we use. However it turns out that the choice
of which alternative to use makes no difference to the asymptotic form of the
expectation of the clustering coefficient and so the results from [3] are directly
comparable with ours.

In the following we only consider properties of the underlying undirected 
graph. However, it is helpful to have the extra notation and terminology of 
directed graphs to simplify the reading of some of the proofs. 



3 Subgraphs of $G^n_{1,\beta}$



Let $S$ be a labelled directed forest with no isolated vertices, in which each
vertex has either one or no out-going edge and each directed edge $(v_i, v_j)$ has
$i > j$. Moreover if $v_1$ belongs to $S$ then this vertex has no outgoing edge. The
restrictions on $S$ are precisely those that ensure that $S$ can occur as a subgraph
of the evolving Móri tree with $m = 1$. We call such an $S$ a possible forest.

In this section we generalise the calculation in [2] to calculate the probability
that such a graph $S$ is a subgraph of $G^n_{1,\beta}$ for $\beta > 0$. We will follow the method
and notation of [3] closely.

We emphasise that we are not computing the probability that $G^n_{1,\beta}$ contains
a subgraph isomorphic to $S$; the labels of the vertices of $S$ must correspond to
the vertex labels of $G^n_{1,\beta}$ for $S$ to be considered to be a subgraph of $G^n_{1,\beta}$.

Denote the vertices of $S$ by $v_{s_1}, \ldots, v_{s_k}$, where $s_j < s_{j+1}$ for $1 \le j \le k - 1$.
Furthermore, let

\[ V^- = \{v_i \in V(S) : \text{there is a } j > i \text{ such that } (v_j, v_i) \in E(S)\} \]

and

\[ V^+ = \{v_i \in V(S) : \text{there is a } j < i \text{ such that } (v_i, v_j) \in E(S)\}. \]

Let $d^{in}_S(v)$ ($d^{out}_S(v)$) denote the in-degree (out-degree) of $v$ in $S$. In particular,
$d^{out}_S(v)$ is either zero or one. For $t \ge i$, let $R_t(i) = |\{j > t : (v_j, v_i) \in E(S)\}|$.
Observe that $R_i(i) = d^{in}_S(v_i)$. Moreover, let $c_S(i) = \sum_{k=1}^{i-1} R_{i-1}(k)$. Hence $c_S(i)$
is the number of edges in $E(S)$ from $\{v_i, \ldots, v_n\}$ to $\{v_1, \ldots, v_{i-1}\}$.
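
The bookkeeping quantities $R_t(i)$ and $c_S(i)$ are easy to compute for a concrete forest. The sketch below is only an illustration of the definitions as I read them, with $R_t(i)$ the number of edges of $S$ into $v_i$ from vertices of index greater than $t$; the representation of $S$ as a list of (tail index, head index) pairs and the function names are assumptions.

```python
def R(edges, t, i):
    """Number of edges (v_j, v_i) of S with j > t."""
    return sum(1 for tail, head in edges if head == i and tail > t)


def c_S(edges, i):
    """Sum of R_{i-1}(k) over k = 1, ..., i - 1."""
    return sum(R(edges, i - 1, k) for k in range(1, i))


def crossing_edges(edges, i):
    """Edges of S from {v_i, v_{i+1}, ...} to {v_1, ..., v_{i-1}}, counted directly."""
    return sum(1 for tail, head in edges if tail >= i and head < i)
```

For the forest with edges $(v_3, v_1)$, $(v_4, v_1)$, $(v_6, v_4)$ this gives $R_1(1) = 2$, the in-degree of $v_1$, and `c_S` agrees with the direct count `crossing_edges` for every index.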



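Lemma 1. Let $\beta > 0$ and $S$ be a possible forest. Then for $t \ge s_k$ the probability
that $S$ is a subgraph of $G^t_{1,\beta}$ is given by

\[ \frac{\Gamma(\beta + d^{in}_S(v_1))}{\Gamma(\beta)} \prod_{\substack{v_i \in V^-(S) \\ i \ge 2}} \frac{\Gamma(1 + \beta + d^{in}_S(v_i))}{\Gamma(1 + \beta)} \prod_{v_i \in V^+} \frac{1}{(2+\beta)(i-1) - 2} \prod_{\substack{2 \le i \le t \\ v_i \notin V^+}} \left(1 + \frac{c_S(i)}{(2+\beta)(i-1) - 2}\right). \]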

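Proof. The proof is a generalisation of the proof for the analogous result in the
case $\beta = 0$ in [3] but we include it for completeness.

Let $S_t$ be the subgraph of $S$ induced by the vertices $\{v_1, \ldots, v_t\} \cap V(S)$. We
need to define the following random variables

\[ X_t = \prod_{(v_i,v_j) \in E(S_t)} I_{(v_i,v_j) \in E(G^t_{1,\beta})} \prod_{i=1}^{t} \frac{\Gamma(d_t(v_i) + \beta + R_t(i))}{\Gamma(d_t(v_i) + \beta)} \]

and

\[ Y_t = \prod_{(v_i,v_j) \in E(S_{t+1})} I_{(v_i,v_j) \in E(G^{t+1}_{1,\beta})} \prod_{i=1}^{t} \frac{\Gamma(d_{t+1}(v_i) + \beta + R_{t+1}(i))}{\Gamma(d_{t+1}(v_i) + \beta)}, \]

where $I_A$ is the indicator of the event $A$.

Note that $d_t(v_j)$ for $1 \le j \le t$ and $X_t$ are functions of the random variables
$f_2, \ldots, f_t$ while $Y_t$ is a function of the random variables $f_2, \ldots, f_{t+1}$. However,
for all $j$, $R_t(j)$ is deterministic.

Observe that

\[ X_{t+1} = Y_t \, \frac{\Gamma(d_{t+1}(v_{t+1}) + \beta + R_{t+1}(t+1))}{\Gamma(d_{t+1}(v_{t+1}) + \beta)} = Y_t \, \frac{\Gamma(1 + \beta + R_{t+1}(t+1))}{\Gamma(1 + \beta)}. \]

First, assume that there is no $r \le t$ such that $(v_{t+1}, v_r) \in E(S)$ and so the
new edge added at time $t + 1$ cannot belong to $S$. This implies that for $i \le t$,
$R_t(i) = R_{t+1}(i)$ and

\[ \prod_{(v_i,v_j) \in E(S_t)} I_{(v_i,v_j) \in E(G^t_{1,\beta})} = \prod_{(v_i,v_j) \in E(S_{t+1})} I_{(v_i,v_j) \in E(G^{t+1}_{1,\beta})}. \]

Furthermore for all $i \le t$ with $i \ne f_{t+1}$, we have $d_{t+1}(v_i) = d_t(v_i)$. We also have
$d_{t+1}(v_{f_{t+1}}) = d_t(v_{f_{t+1}}) + 1$.

For the moment fix $f_2, \ldots, f_t$ so that $X_t$ is completely determined. Now,

\[ Y_t = \left(1 + \frac{R_t(f_{t+1})}{d_t(v_{f_{t+1}}) + \beta}\right) X_t. \]

Thus

\[ E[Y_t - X_t \mid f_2, \ldots, f_t] = \sum_{r=1}^{t} \frac{R_t(r)}{d_t(v_r) + \beta} \Pr(f_{t+1} = r) X_t = \sum_{r=1}^{t} \frac{R_t(r)}{d_t(v_r) + \beta} \cdot \frac{d_t(v_r) + \beta}{(2+\beta)t - 2} X_t = \frac{c_S(t+1)}{(2+\beta)t - 2} X_t. \]

By taking expectation with respect to $f_2, \ldots, f_t$ we obtain

\[ E[Y_t] = \left(1 + \frac{c_S(t+1)}{(2+\beta)t - 2}\right) E[X_t] \]

and

\[ E[X_{t+1}] = \left(1 + \frac{c_S(t+1)}{(2+\beta)t - 2}\right) \frac{\Gamma(1 + \beta + R_{t+1}(t+1))}{\Gamma(1 + \beta)} E[X_t]. \tag{3.1} \]

Now suppose $(v_{t+1}, v_r)$ is an edge of $S$ for some $r < t + 1$. If $f_{t+1} \ne r$ then
$X_{t+1} = 0$ so we will suppose that $f_{t+1} = r$. Then for all $i \le t$ with $i \ne r$,
$d_{t+1}(v_i) = d_t(v_i)$, and $d_{t+1}(v_r) = d_t(v_r) + 1$. Furthermore for all $i \le t$, $i \ne r$,
$R_{t+1}(i) = R_t(i)$, but $R_{t+1}(r) = R_t(r) - 1$.

Hence providing $f_{t+1} = r$, we have

\[ Y_t = \frac{X_t}{d_t(v_r) + \beta}. \]

So

\[ E[Y_t \mid f_2, \ldots, f_t] = \frac{d_t(v_r) + \beta}{(2+\beta)t - 2} \cdot \frac{X_t}{d_t(v_r) + \beta} = \frac{X_t}{(2+\beta)t - 2}. \]

Thus

\[ E[X_{t+1} \mid f_2, \ldots, f_t] = \frac{1}{(2+\beta)t - 2} \cdot \frac{\Gamma(1 + \beta + R_{t+1}(t+1))}{\Gamma(1 + \beta)} X_t. \]

So by taking expectation with respect to $f_2, \ldots, f_t$,

\[ E[X_{t+1}] = \frac{1}{(2+\beta)t - 2} \cdot \frac{\Gamma(1 + \beta + R_{t+1}(t+1))}{\Gamma(1 + \beta)} E[X_t]. \tag{3.2} \]

Note that $X_1 = \Gamma(\beta + R_1(1))/\Gamma(\beta)$ and that for $t \ge s_k$, we have $\Pr(S \subseteq G^t_{1,\beta}) =
E[X_t]$. Using (3.1) and (3.2) and noting that $R_i(i) = 0$ for $v_i \notin V^-$, we have
for $t \ge s_k$

\[ \Pr(S \subseteq G^t_{1,\beta}) = \frac{\Gamma(\beta + R_1(1))}{\Gamma(\beta)} \prod_{\substack{v_i \in V^- \\ i \ge 2}} \frac{\Gamma(1 + \beta + R_i(i))}{\Gamma(1 + \beta)} \prod_{v_i \in V^+} \frac{1}{(2+\beta)(i-1) - 2} \prod_{\substack{2 \le i \le t \\ v_i \notin V^+}} \left(1 + \frac{c_S(i)}{(2+\beta)(i-1) - 2}\right). \]

This is easily seen to be equivalent to the expression in the statement of the
lemma. □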
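
The recursion in the proof can be checked against brute-force enumeration on very small trees. The sketch below is not from the paper. It assumes the product form obtained by iterating the two one-step recursions from $X_1 = \Gamma(\beta + R_1(1))/\Gamma(\beta)$, uses exact rational arithmetic (gamma ratios with integer offsets are rising factorials), and the names `lemma1_probability` and `exact_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def rising(x, r):
    """Gamma(x + r) / Gamma(x) for an integer r >= 0."""
    out = Fraction(1)
    for q in range(r):
        out *= x + q
    return out


def lemma1_probability(edges, beta):
    """Product formula for Pr(S is a subgraph), S given as (tail, head) index pairs."""
    beta = Fraction(beta)
    top = max(tail for tail, _ in edges)
    has_out = {tail for tail, _ in edges}

    def indeg(i):
        return sum(1 for _, head in edges if head == i)

    prob = rising(beta, indeg(1))                    # contribution of X_1
    for i in range(2, top + 1):
        prob *= rising(1 + beta, indeg(i))           # gamma factor for v_i
        denom = (2 + beta) * (i - 1) - 2
        if i in has_out:                             # v_i sends an edge of S
            prob /= denom
        else:                                        # v_i must not break later edges
            c = sum(1 for tail, head in edges if tail >= i and head < i)
            prob *= 1 + Fraction(c) / denom
    return prob


def exact_probability(edges, beta, t):
    """Sum the probabilities of all trees on t vertices that contain S."""
    beta = Fraction(beta)
    total = Fraction(0)
    for heads in product(*(range(1, i) for i in range(2, t + 1))):
        # heads[i - 2] is the target of v_i
        if any(heads[tail - 2] != head for tail, head in edges):
            continue
        deg = [0] * (t + 1)
        p = Fraction(1)
        for i, h in enumerate(heads, start=2):
            p *= (deg[h] + beta) / ((2 + beta) * (i - 1) - 2)
            deg[h] += 1
            deg[i] += 1
        total += p
    return total
```

The enumeration visits $(t-1)!$ trees, so it is only practical for $t$ up to about eight; it is meant as a check of the algebra rather than as a way to compute anything.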

We now provide a more convenient form for the probability given in Lemma
1. This calculation is almost identical to the analogous one in [3] so we omit
the proof.



Lemma 2. Let $\beta > 0$ and $S$ be a possible forest. Then for $t \ge s_k$ the probability
that $S$ is a subgraph of $G^t_{1,\beta}$ is given by

Pr(5 C Gil,) 



'^sM+p.:.':iv- ^(1+^) 



fe 



4 Calculation of Expectations 

Recall that the clustering coefficient $C(G)$ of a graph $G$ is given by

\[ C(G) = \frac{3 \times \text{number of triangles in } G}{\sum_{v \in V(G)} \binom{d(v)}{2}}. \]

In this section we calculate the expectations of the numerator and denominator
of this expression.

4.1 Expected Number of Triangles 

We adapt the methods used in [3] to the case $\beta > 0$. For fixed $a < b < c$, we
first calculate the expected number of triangles in $G^n_{m,\beta}$ on vertices $v_a, v_b, v_c$.
Let $G^{nm}_{1,\beta}$ be the underlying tree used to form $G^n_{m,\beta}$. Label the vertices of the
tree $v'_1, \ldots, v'_{nm}$. A triangle on $v_a, v_b, v_c$ arises if there are vertices $v'_{a_1}, v'_{a_2}$
with $(a-1)m + 1 \le a_1, a_2 \le am$, $v'_{b_1}, v'_{b_2}$ with $(b-1)m + 1 \le b_1, b_2 \le bm$
and $v'_{c_1}, v'_{c_2}$ with $(c-1)m + 1 \le c_1, c_2 \le cm$ such that $v'_{b_1}$ sends its outgoing
edge to $v'_{a_1}$, $v'_{c_1}$ sends its outgoing edge to $v'_{a_2}$ and $v'_{c_2}$ sends its outgoing edge
to $v'_{b_2}$. For this to be possible, we need $c_1 \ne c_2$. Let $S$ be the graph with
vertices $v'_{a_1}, v'_{a_2}, v'_{b_1}, v'_{b_2}, v'_{c_1}, v'_{c_2}$ and edges $(v'_{b_1}, v'_{a_1})$, $(v'_{c_1}, v'_{a_2})$ and $(v'_{c_2}, v'_{b_2})$.
Write $a_1 = am - l_1$, $a_2 = am - l_2$, $b_1 = bm - l_3$, $b_2 = bm - l_4$, $c_1 = cm - l_5$ and
$c_2 = cm - l_6$. The cases where $a_1 = a_2$ and $a_1 \ne a_2$ are slightly different. We
concentrate on the former to begin with.

We have $d^{in}_S(v'_{a_1}) = 2$, $d^{in}_S(v'_{b_2}) = 1$ and otherwise $d^{in}_S(v) = 0$. Suppose that
$a_1 > 1$. Then applying Lemma 2 we see that

\[ \Pr(S \subseteq G^{nm}_{1,\beta}) = \frac{\Gamma(3+\beta)\Gamma(2+\beta)}{(\Gamma(1+\beta))^2 (2+\beta)^3} \left(\frac{1}{a_1 a_2 b_2 (b_1 c_1 c_2)^{1+\beta}}\right)^{1/(2+\beta)} \exp(O(1/a)). \tag{4.1} \]

The same expression holds when $a_1 = 1$ because the extra multiplicative term
of $\beta/(2+\beta)$ may be absorbed into the error term. Note that for $-1 < x < 1$,
we have $e^x = 1 + O(x)$. Furthermore $1/a_i = 1/(am)(1 + O(1/a))$, $1/b_i =
1/(bm)(1 + O(1/a))$ and $1/c_i = 1/(cm)(1 + O(1/a))$. So we may rewrite (4.1)
as follows:

\[ \Pr(S \subseteq G^{nm}_{1,\beta}) = \frac{(1+\beta)^2}{(2+\beta)^2 m^3} \left(\frac{1}{a^2 b^{2+\beta} c^{2+2\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \]



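In this case where $a_1 = a_2$, there are $m^4(m-1)$ ways to choose $a_1, a_2, b_1, b_2, c_1, c_2$
so that there is a corresponding triangle on $v_a, v_b, v_c$ in $G^n_{m,\beta}$.

Now we suppose that $a_1 \ne a_2$. We have $d^{in}_S(v'_{a_1}) = d^{in}_S(v'_{a_2}) = d^{in}_S(v'_{b_2}) = 1$
and otherwise $d^{in}_S(v) = 0$. Applying Lemma 2 and carrying out similar
calculations to those above we obtain

\[ \Pr(S \subseteq G^{nm}_{1,\beta}) = \frac{(1+\beta)^3}{(2+\beta)^3 m^3} \left(\frac{1}{a^2 b^{2+\beta} c^{2+2\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \]

In this case there are $m^4(m-1)^2$ ways to choose $a_1, a_2, b_1, b_2, c_1, c_2$.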

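Let $N_{a,b,c}$ denote the number of triangles on $v_a, v_b, v_c$ in $G^n_{m,\beta}$. From the
calculations above, we see that

\[ E[N_{a,b,c}] = \left(m(m-1)\frac{(1+\beta)^2}{(2+\beta)^2} + m(m-1)^2\frac{(1+\beta)^3}{(2+\beta)^3}\right) \left(\frac{1}{a^2 b^{2+\beta} c^{2+2\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \tag{4.2} \]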



Now let $N$ be the number of triangles in $G^n_{m,\beta}$. Then to calculate $E[N]$ we
merely sum (4.2) over all $a, b, c$ with $a < b < c$. If we estimate this sum by
integrating, we obtain the following.

Proposition 1. For $\beta > 0$, the expected number of triangles in $G^n_{m,\beta}$ is

\[ \left(m(m-1)\frac{(1+\beta)^2}{\beta^2} + m(m-1)^2\frac{(1+\beta)^3}{\beta^2(2+\beta)}\right) \log n + O(1). \]

This result is very different from that obtained in [3] where it is shown that
when $\beta = 0$ the expected number of triangles is $\Theta((\log n)^3)$.



4.2 Expectation of $\sum_{v \in V(G)} \binom{d(v)}{2}$



We begin by noting that if we regard each edge in the graph as consisting
of two half-edges, with each half-edge retaining one endpoint of an edge, then
$\sum_{v \in V(G)} \binom{d(v)}{2}$ is the number of pairs of half-edges with the same endpoint.
We say such a pair of half-edges is adjacent. Suppose that $e_1$ and $e_2$ are
half-edges with endpoint $v$. If $e_1$ and $e_2$ form respectively half of edges $vu$ and $vw$
with $u, v, w$ pairwise distinct then we say that $e_1$ and $e_2$ form a non-degenerate
pair of adjacent half-edges. Otherwise we say that they are degenerate.

Calculating the expected number of pairs of adjacent half-edges is slightly
more complicated than calculating the expected number of triangles because
there is less symmetry. We begin by counting the number of non-degenerate
pairs of adjacent half-edges. Let $a < b < c$. We first calculate the expected
number of pairs $(v_b, v_a), (v_c, v_a)$ of adjacent half-edges in $G^n_{m,\beta}$ for $\beta > 0$. Just as
in the previous section, there are two cases to consider, and similar calculations
to those above, using Lemma 2, show that the expected number of such pairs of
adjacent half-edges is

\[ \left(m\frac{1+\beta}{2+\beta} + m(m-1)\frac{(1+\beta)^2}{(2+\beta)^2}\right) \left(\frac{1}{a^2 (bc)^{1+\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \]

By integrating, we see that the expected total number of pairs of adjacent half-edges in
$G^n_{m,\beta}$ for which the common vertex has the smallest index is

\[ \left(m\frac{2+\beta}{\beta} + m(m-1)\frac{1+\beta}{\beta}\right) n + O(n^{2/(2+\beta)}). \]



Now the expected number of pairs $(v_b, v_a), (v_c, v_b)$ of adjacent half-edges is

\[ m^2\frac{(1+\beta)^2}{(2+\beta)^2} \left(\frac{1}{a\, b^{2+\beta} c^{1+\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \]

Again we integrate to derive that the total number of pairs of adjacent half-edges
in $G^n_{m,\beta}$ for which the common vertex has the middle index is $m^2 n + O(n^{2/(2+\beta)})$.
This is not surprising because it can be shown that very few vertices either have
loops or do not have $m$ distinct out-neighbours. Each loopless vertex with $m$
distinct loopless out-neighbours, that each have $m$ distinct out-neighbours, is
the vertex with greatest index in $m^2$ pairs of adjacent half-edges of this form.

Finally the expected number of pairs $(v_c, v_a), (v_c, v_b)$ of adjacent half-edges
is

\[ m(m-1)\frac{(1+\beta)^2}{(2+\beta)^2} \left(\frac{1}{a\, b\, c^{2+2\beta}}\right)^{1/(2+\beta)} (1 + O(1/a)). \]

So the total number of pairs of adjacent half-edges in $G^n_{m,\beta}$ for which the
common vertex has the largest index is $\frac{m(m-1)}{2} n + O(n^{2/(2+\beta)})$. Again this is
not surprising because each loopless vertex with $m$ distinct out-neighbours is
the vertex of greatest index in $\binom{m}{2}$ pairs of adjacent half-edges of this form.

By carrying out similar calculations to those above, it can be shown that
the number of degenerate pairs of adjacent half-edges is $O(n^{2/(2+\beta)})$.

Summing over all the possibilities we obtain the following result. 

Proposition 2. For $\beta > 0$, the expectation of $\sum_{v \in V(G)} \binom{d(v)}{2}$ in $G^n_{m,\beta}$ is

\[ \left(\frac{2+5\beta}{2\beta} m^2 + \frac{2-\beta}{2\beta} m\right) n + O(n^{2/(2+\beta)}). \]

Again the result is different from that obtained in [3] where it was shown that
for the case $\beta = 0$, the expected number of pairs of adjacent edges is $\Theta(n \log n)$.
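
The leading coefficient in Proposition 2 should equal the sum of the three leading coefficients obtained for the smallest, middle and largest index cases. The following sketch checks that identity with exact arithmetic. It assumes the three contributions are $m(2+\beta)/\beta + m(m-1)(1+\beta)/\beta$, $m^2$ and $m(m-1)/2$, and the closed form $\frac{2+5\beta}{2\beta}m^2 + \frac{2-\beta}{2\beta}m$; the function names are hypothetical.

```python
from fractions import Fraction


def c2_from_cases(m, beta):
    """Sum of the leading coefficients of the three non-degenerate cases."""
    beta = Fraction(beta)
    smallest = m * (2 + beta) / beta + m * (m - 1) * (1 + beta) / beta
    middle = Fraction(m * m)
    largest = Fraction(m * (m - 1), 2)
    return smallest + middle + largest


def c2_closed_form(m, beta):
    """Closed form of the leading coefficient as stated in Proposition 2."""
    beta = Fraction(beta)
    return (2 + 5 * beta) / (2 * beta) * m * m + (2 - beta) / (2 * beta) * m
```

For example, with $m = 1$ and $\beta = 1$ both functions return 4.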

5 Concentration of $\sum_{v \in V(G)} \binom{d(v)}{2}$

In this section we show that the number of pairs of adjacent half-edges in $G^n_{m,\beta}$
is concentrated about its mean. This justifies obtaining the clustering coefficient
by taking three times the quotient of the expected number of triangles and the
expected number of pairs of adjacent half-edges. The main strategy is to apply
a variant of the Azuma-Hoeffding inequality from [9], by making use of Móri's
results [11] on the evolution of the maximum degree of $G^n_{m,\beta}$. A key notion in
the proof is to consider the mechanism by which edges incident with a fixed
vertex are added.

Fix $\beta$ and $m$. Let $(H_t)$ be the graph process defined as follows. Run $(G^t_{1,\beta})$
and take $H_n$ to be the graph formed from $G^n_{1,\beta}$ by merging groups of $m$
consecutive vertices together until there are at most $m$ left and finally merging the
remaining unmerged vertices together. Note that $H_n$ has $\lceil n/m \rceil$ vertices, which
we denote by $v_1, \ldots, v_{\lceil n/m \rceil}$ in the obvious way, and $n - 1$ edges. Furthermore,
if $m \mid n$ and the graphs $H_n$ and $G^{n/m}_{m,\beta}$ are formed from the same instance of the
process $(G^t_{1,\beta})$, then $H_n$ and $G^{n/m}_{m,\beta}$ are the same graph.

Let $v_k$ be a vertex of $H_s$ such that $km \le s$. For $t \ge s$, we define a partition
$\Pi_{k,s}(t)$ of the half-edges incident with $v_k$. The partition always has $d_s(v_k) + 1$
blocks. When $t = s$, each block of the partition except for one contains one of
the $d_s(v_k)$ half-edges incident with $v_k$; with a slight abuse of nomenclature the
other block, which we call the base block, is initially empty. It follows that if
$v_k$ has a loop at time $s$ then the two half-edges forming the loop are in separate
blocks of $\Pi_{k,s}(s)$. As $t$ increases and more edges are added to $H$, any newly
added half-edge incident with $v_k$ is added to the partition. If at time $t > s$
the target vertex of the newly added edge is not $v_k$ then $\Pi_{k,s}(t) = \Pi_{k,s}(t-1)$.
Suppose that at time $t > s$ the target vertex of the newly added edge $f$ is $v_k$: if
$v_k$ is chosen preferentially by copying the half-edge $e \in A$, where $A$ is a block of
$\Pi_{k,s}(t-1)$, then we form $\Pi_{k,s}(t)$ from $\Pi_{k,s}(t-1)$ by adding the half-edge of $f$
incident with $v_k$ to $A$; if $v_k$ is chosen uniformly then the half-edge of $f$ incident
with $v_k$ is added to the base block.

Suppose that $v_l$ is a vertex of $H_s$ distinct from $v_k$ such that $lm \le s$. Suppose
further that we choose two distinct blocks from $\Pi_{k,s}(t)$ and $\Pi_{l,s}(t)$, such that
neither is a base block. The joint distribution of the sizes of the two blocks is
the same for any choice of blocks, whether they are both chosen from $\Pi_{k,s}(t)$,
both from $\Pi_{l,s}(t)$, or one from each. Furthermore if we choose either base block from
$\Pi_{k,s}(t)$ or $\Pi_{l,s}(t)$ and one other block that is not a base block, then again the
joint distribution of the sizes of the blocks does not depend on our choice.

Lemma 3. Let $v_j$ and $v_k$ be distinct vertices of $H_s$ such that $\max\{jm, km\} \le s$.
Let $A$ ($B$) be respectively a block of $\Pi_{j,s}(t)$ ($\Pi_{k,s}(t)$) such that neither is a base
block. Then

\[ E[|A||B|] \le E[|A|]\,E[|B|] \le (t/s)^{2/(2+\beta)} (1 + O(1/s)). \]

Proof. Let $e_1, e_2$ be half-edges so that at time $s$, $e_1$ is incident with $v_j$ and $e_2$
is incident with $v_k$. Then let $a_t$ denote the size, at time $t$, of the block of $\Pi_{j,s}(t)$
containing $e_1$ and let $b_t$ be defined similarly with respect to $\Pi_{k,s}(t)$ and $e_2$. We
first establish the second inequality. We have $E[a_s] = 1$ and for $t \ge s$,

\[ E[a_{t+1} \mid a_t] = a_t \left(1 + \frac{1}{(2+\beta)t - 2}\right). \tag{5.1} \]

Hence

\[ E[a_{t+1}] = \frac{t - 1/(2+\beta)}{t - 2/(2+\beta)}\, E[a_t]. \]

Solving this recurrence, we obtain

\[ E[a_t] = \frac{\Gamma\left(t - \frac{1}{2+\beta}\right) \Gamma\left(s - \frac{2}{2+\beta}\right)}{\Gamma\left(s - \frac{1}{2+\beta}\right) \Gamma\left(t - \frac{2}{2+\beta}\right)}. \]

A standard result on the ratio of gamma functions [5] states that if $a, b$ are fixed
members of $\mathbb{R}$ then for all $x > \max\{|a|, |b|\}$,

\[ \frac{\Gamma(x + b)}{\Gamma(x + a)} = x^{b-a} (1 + O(1/x)). \]

Using this result, we obtain

\[ E[a_t] \le (t/s)^{1/(2+\beta)} (1 + O(1/s)). \]

Since $|A|$ and $|B|$ are identically distributed, the second inequality in the lemma
follows. We prove the first inequality by using induction on $t$. Observe that
$(a_{t+1}, b_{t+1})$ can take the values $(a_t + 1, b_t)$, $(a_t, b_t + 1)$ and $(a_t, b_t)$ with
probabilities respectively $a_t/((2+\beta)t - 2)$, $b_t/((2+\beta)t - 2)$ and $1 - (a_t + b_t)/((2+\beta)t - 2)$.
Therefore

\[ E[a_{t+1} b_{t+1} \mid a_t, b_t] = a_t b_t + \frac{2 a_t b_t}{(2+\beta)t - 2} \]

and from (5.1) we get

\[ E[a_{t+1}]\,E[b_{t+1}] = E[a_t]\,E[b_t] \left(1 + \frac{1}{(2+\beta)t - 2}\right)^2. \]

So

\[ E[a_{t+1} b_{t+1}] - E[a_{t+1}]\,E[b_{t+1}] \le \left(1 + \frac{2}{(2+\beta)t - 2}\right) \left(E[a_t b_t] - E[a_t]\,E[b_t]\right) \]

and hence, since $a_s b_s = 1 = E[a_s]\,E[b_s]$, the result follows by induction. □
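
The closed form for the expected block size can be compared numerically with the recurrence it solves. The sketch below is illustrative only: it assumes the one-step growth factor $1 + 1/((2+\beta)t - 2)$ starting from size one at time $s$, and the function names are hypothetical. It uses `math.lgamma` to avoid overflow in the gamma functions.

```python
from math import exp, lgamma


def expected_block_size_recurrence(s, t, beta):
    """Iterate E[a_{i+1}] = E[a_i] * (1 + 1/((2 + beta) i - 2)) from E[a_s] = 1."""
    value = 1.0
    for i in range(s, t):
        value *= 1 + 1 / ((2 + beta) * i - 2)
    return value


def expected_block_size_gamma(s, t, beta):
    """Closed form as a ratio of four gamma functions, evaluated via log-gamma."""
    a = 1 / (2 + beta)
    b = 2 / (2 + beta)
    return exp(lgamma(t - a) + lgamma(s - b) - lgamma(s - a) - lgamma(t - b))
```

With $s = 10$, $t = 1000$ and $\beta = 1$ the two functions agree to floating-point accuracy, and both are within a few percent of $(t/s)^{1/(2+\beta)}$, consistent with the $1 + O(1/s)$ correction.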



When the maximum degree of $H_t$ becomes unusually large and the target
vertex is chosen to be a vertex of maximum degree, the number of pairs of
adjacent edges increases by an unusually large amount. The next result enables
us to show that the probability of this happening is extremely small. Let $\Delta(G)$
denote the maximum degree of $G$. The following is a very slight reformulation
of what Móri proves in [11, Theorem 3.1].

Theorem 1. For any positive integer $k$, there exists $M_k \in \mathbb{R}$, such that for all $n$,

\[ E\left[\left(\frac{\Delta(G^n_{1,\beta}) + \beta}{n^{1/(2+\beta)}}\right)^k\right] < M_k. \]

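The following corollary is straightforward.

Corollary 1. For any positive integers $k, m$, there exists $M_{k,m} \in \mathbb{R}$ such that
for all positive integers $i_1, \ldots, i_k$,

\[ E\left[\frac{\Delta(H_{mi_1})}{(mi_1)^{1/(2+\beta)}} \cdots \frac{\Delta(H_{mi_k})}{(mi_k)^{1/(2+\beta)}}\right] < M_{k,m}. \]

Proof. Since $\Delta(H_{mi_1}), \ldots, \Delta(H_{mi_k})$ are all positive we have

\[ \frac{\Delta(H_{mi_1})}{(mi_1)^{1/(2+\beta)}} \cdots \frac{\Delta(H_{mi_k})}{(mi_k)^{1/(2+\beta)}} \le \sum_{j=1}^{k} \left(\frac{\Delta(H_{mi_j})}{(mi_j)^{1/(2+\beta)}}\right)^k \]

and so

\[ E\left[\frac{\Delta(H_{mi_1})}{(mi_1)^{1/(2+\beta)}} \cdots \frac{\Delta(H_{mi_k})}{(mi_k)^{1/(2+\beta)}}\right] \le \sum_{j=1}^{k} E\left[\left(\frac{\Delta(H_{mi_j})}{(mi_j)^{1/(2+\beta)}}\right)^k\right]. \]

Recall that $H_{mi}$ is formed by merging together blocks of $m$ consecutive vertices
in an instance of $G^{mi}_{1,\beta}$. So we have $E[(\Delta(H_{mi}))^k] \le E[(m\Delta(G^{mi}_{1,\beta}))^k]$.
Hence

\[ \sum_{j=1}^{k} E\left[\left(\frac{\Delta(H_{mi_j})}{(mi_j)^{1/(2+\beta)}}\right)^k\right] \le m^k \sum_{j=1}^{k} E\left[\left(\frac{\Delta(G^{mi_j}_{1,\beta})}{(mi_j)^{1/(2+\beta)}}\right)^k\right] \le k m^k M_k. \]

The result follows by taking $M_{k,m} = k m^k M_k$. □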



Before we can state the large deviation result that we use, we need some more
definitions. Recall that $f_i$ is a random variable which determines the index of the
target vertex of $v_i$ and that the values taken by $f_2, f_3, \ldots, f_t$ together determine
$H_t$. Furthermore the set of values that $f_i$ can take is denoted by $\Omega_i$ and $f_2, \ldots, f_t$
are independent. Let $\Omega = \prod_{i=2}^{t} \Omega_i$.

Let $X = (f_2, \ldots, f_t)$. We let $H_t(X)$ be the instance of $H_t$ determined by
the random variables $f_2, \ldots, f_t$. We will also use this notation both for other
random variables associated with $H_t$ and when some or all of the $f_i$'s are set to
a particular value. The meaning should be clear from the context but we will
generally use $\omega_i$ for a member of $\Omega_i$ and $f_i$ for a random variable taking values
in $\Omega_i$.

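Let $D(X) = \sum_{v \in V(H_t(X))} \binom{d(v)}{2}$ and let $F(X) = D(X)\, t^{-2/(2+\beta)}$. Now let
$g : \prod_{i=2}^{s} \Omega_i \to \mathbb{R}$ such that

\[ (\omega_2, \ldots, \omega_s) \mapsto E[F(\omega_2, \ldots, \omega_s, f_{s+1}, \ldots, f_t)] \]

and let $\mathrm{ran} : \prod_{i=2}^{s-1} \Omega_i \to \mathbb{R}$ such that

\[ (\omega_2, \ldots, \omega_{s-1}) \mapsto \sup\{|g(\omega_2, \ldots, \omega_{s-1}, x) - g(\omega_2, \ldots, \omega_{s-1}, y)| : x, y \in \Omega_s\}. \]

So $\mathrm{ran}(\omega_2, \ldots, \omega_{s-1})$ measures the maximum amount that the expected value
of $F(X)$ changes when the value of $f_s$ is changed.
For $\omega \in \Omega$, let

\[ R^2(\omega) = \sum_{k=2}^{t} \left(\mathrm{ran}(\omega_2, \ldots, \omega_{k-1})\right)^2. \]

Our aim is to bound $R^2(\omega)$ as $\omega$ runs over all members of $\Omega$ with the possible
exception of those belonging to some 'bad' subset $B$ which we hope to have
small probability. We specify $B$ below but for the moment let $B$ be any subset
of $\Omega$. Let

\[ \hat{r}^2 = \sup\{R^2(\omega) : \omega \in \Omega \setminus B\}. \]

Then Theorem 3.7 in [9] yields the following inequality. For all $x > 0$,

\[ \Pr(|F(X) - E[F(X)]| > x) \le 2\left(e^{-2x^2/\hat{r}^2} + \Pr(X \in B)\right). \]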
Fix $\delta > 0$. We let

\[ B_\delta = \left\{ X \in \Omega : \sum_{i=1}^{n} \left(\frac{\Delta(H_{mi}(X))}{(mi)^{2/(2+\beta)}}\right)^2 > n^{\beta/(2+\beta) + \delta} \right\}. \]

Then we have the following.

Lemma 4. For any $\delta > 0$ and $\gamma > 0$, there exists $L$ such that $\Pr(B_\delta) \le L n^{-\gamma}$,
where $L$ is a constant depending on $\delta, \gamma, \beta, m$ but not on $n$.

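Proof. For any positive integer $k$, Markov's inequality gives

\[ \Pr(B_\delta) \le \frac{E\left[\left(\sum_{i=1}^{n} \left(\frac{\Delta(H_{mi}(X))}{(mi)^{2/(2+\beta)}}\right)^2\right)^k\right]}{n^{k(\beta/(2+\beta) + \delta)}}. \]

The numerator of this fraction is equal to

\[ E\left[\sum_{i_1=1}^{n} \cdots \sum_{i_k=1}^{n} \prod_{j=1}^{k} \left(\frac{\Delta(H_{mi_j}(X))}{(mi_j)^{1/(2+\beta)}}\right)^2 \cdot \frac{1}{(m^k i_1 \cdots i_k)^{2/(2+\beta)}}\right]. \]

Using Corollary 1 this is at most

\[ M_{2k,m} \sum_{i_1=1}^{n} \cdots \sum_{i_k=1}^{n} \frac{1}{(m^k i_1 \cdots i_k)^{2/(2+\beta)}} = M_{2k,m} \left(\sum_{i=1}^{n} \frac{1}{(mi)^{2/(2+\beta)}}\right)^k \le M_{2k,m} \left(\frac{2+\beta}{\beta}\, n^{\beta/(2+\beta)}\right)^k. \]

Hence

\[ \Pr(B_\delta) \le M_{2k,m} \left(\frac{2+\beta}{\beta}\right)^k \frac{1}{n^{k\delta}} \]

and so letting $k = \lceil \gamma/\delta \rceil$ gives the result. □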



We can now state the main result of this section concerning the concentration
of the number of pairs of adjacent half-edges around its expectation.

Theorem 2. Let $\beta > 0$. For any $\varepsilon > 0$, the number $D$ of pairs of adjacent
half-edges in $G^n_{m,\beta}$ is concentrated about its expected value within $O(n^{1/2 + 1/(2+\beta) + \varepsilon})$.
More precisely, for any $\varepsilon > 0$ and $\gamma > 0$ there exists $n^*$ such that for all $n \ge n^*$

\[ \Pr\left(|D - E[D]| > n^{\frac{1}{2} + \frac{1}{2+\beta} + \varepsilon}\right) < \frac{1}{n^\gamma}. \]



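Proof. Let $t = nm$, and fix $s \le t$. Let $s' = m\lceil s/m \rceil$, so we have $s' \le t$. Now let

\[ \omega_x = (\omega_2, \ldots, \omega_{s-1}, x, \omega_{s+1}, \ldots, \omega_t) \quad \text{and} \quad \omega_y = (\omega_2, \ldots, \omega_{s-1}, y, \omega_{s+1}, \ldots, \omega_t), \]

where $\omega_i \in \Omega_i$ and $x, y \in \Omega_s$. For $z \in \{x, y\}$, let $d^z_t(v)$ denote the total degree of
$v$ at time $t$ in $H_t(\omega_z)$ and let $e$ denote the edge added at time $s$. Suppose that
in $H_t(\omega_x)$ the target vertex of $e$ is $v_{k_1}$ and in $H_t(\omega_y)$ the target vertex of $e$ is
$v_{k_2}$. Note that at any time, for every vertex $v$ other than $v_{k_1}$ or $v_{k_2}$, the degree
of $v$ is the same in $H_t(\omega_x)$ and $H_t(\omega_y)$. Therefore $F(\omega_x) - F(\omega_y)$ depends only
on the degrees of $v_{k_1}$ and $v_{k_2}$ and is given by

\[ F(\omega_x) - F(\omega_y) = t^{-2/(2+\beta)} \left(\binom{d^x_t(v_{k_1})}{2} + \binom{d^x_t(v_{k_2})}{2} - \binom{d^y_t(v_{k_1})}{2} - \binom{d^y_t(v_{k_2})}{2}\right). \tag{5.2} \]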
From now on we will assume that $k_1 \ne k_2$, because otherwise $F(\omega_x) - F(\omega_y) = 0$.
Consider the changes that occur to $H_{s'}$ if we replace $\omega_y$ by $\omega_x$. First the head of
$e$ is moved from $v_{k_2}$ to $v_{k_1}$. Second it is possible that each of the at most $m - 1$
edges that are added in the time interval $[s + 1, s']$ also have an endpoint moved
from $v_{k_2}$ to $v_{k_1}$: this will happen if the target vertex of an edge added in the
interval $[s + 1, s']$ is chosen by preferentially copying the head of an edge which
has been moved from $v_{k_2}$ to $v_{k_1}$, in particular if the target vertex is chosen by
preferentially copying the head of $e$. Consequently we have

\[ d^y_{s'}(v_{k_1}) + 1 \le d^x_{s'}(v_{k_1}) \le d^y_{s'}(v_{k_1}) + m \]

and furthermore

\[ d^x_{s'}(v_{k_1}) + d^x_{s'}(v_{k_2}) = d^y_{s'}(v_{k_1}) + d^y_{s'}(v_{k_2}). \]

Let $d = d^x_{s'}(v_{k_1}) - d^y_{s'}(v_{k_1})$, $d_1 = d^y_{s'}(v_{k_1})$ and $d_2 = d^x_{s'}(v_{k_2})$. Note that both $d_1$
and $d_2$ and consequently also $|d_1 - d_2|$ are at most $\Delta(H_{s-1}(\omega_2, \ldots, \omega_{s-1})) + m$.
Now let $A_0, A_1, \ldots, A_{d_1}$ ($B_0, B_1, \ldots, B_{d_2}$) denote the blocks of the partition
$\Pi_{k_1,s'}(t)$ in $H_t(\omega_y)$ ($\Pi_{k_2,s'}(t)$ in $H_t(\omega_x)$) with $A_0$ ($B_0$) denoting the base block.
The partition $\Pi_{k_1,s'}(t)$ in $H_t(\omega_x)$ contains the blocks $A_0, \ldots, A_{d_1}$ but also $d$
further blocks which we label $C_1, \ldots, C_d$. Then the partition $\Pi_{k_2,s'}(t)$ in $H_t(\omega_y)$
contains the blocks $B_0, \ldots, B_{d_2}, C_1, \ldots, C_d$. So using (5.2), we have

\[ F(\omega_x) - F(\omega_y) = t^{-2/(2+\beta)} \left(\sum_{i=0}^{d_1} \sum_{j=1}^{d} |A_i||C_j| - \sum_{i=0}^{d_2} \sum_{j=1}^{d} |B_i||C_j|\right). \tag{5.3} \]



Now let

\[ \hat{\omega}_x = (\omega_2, \ldots, \omega_{s-1}, x, \omega_{s+1}, \ldots, \omega_{s'}, f_{s'+1}, \ldots, f_t) \]

and

\[ \hat{\omega}_y = (\omega_2, \ldots, \omega_{s-1}, y, \omega_{s+1}, \ldots, \omega_{s'}, f_{s'+1}, \ldots, f_t). \]

So both $H_t(\hat{\omega}_x)$ and $H_t(\hat{\omega}_y)$ evolve deterministically until time $s'$ but randomly
thereafter.

Recall that $d \le m$ and that $|d_1 - d_2|$ is at most $\Delta(H_{s-1}(\omega_2, \ldots, \omega_{s-1})) + m$.
Hence from (5.3), Lemma 3 and the remarks immediately preceding the lemma,
we see that

\[ |E[F(\hat{\omega}_x) - F(\hat{\omega}_y)]| \le (\Delta(H_{s-1}(\omega_2, \ldots, \omega_{s-1})) + m)\, m\, (1/s')^{2/(2+\beta)} (1 + O(1/s')). \]

Notice that this expression does not depend on $x$ or $y$ and holds for every
$\omega_{s+1}, \ldots, \omega_{s'}$. Consequently

\[ \mathrm{ran}(\omega_2, \ldots, \omega_{s-1}) \le (\Delta(H_{s-1}(\omega_2, \ldots, \omega_{s-1})) + m)\, m\, (1/s')^{2/(2+\beta)} (1 + O(1/s')). \]
Now let $\omega \in \Omega \setminus B_\delta$. Then

nm 

^"^ Z^( ^7275T^ ) (l + 0(Vs)) 

5 = 2 ^ ^ 

^4m \^\^ ^-^^^5:;:^ j (1 + 0(1A)) 

where c is a constant. 
Hence 



Pr f |I?(X) - E [Z)(X)] I > n^+"^ = Pr ('|i^(X) - E [F(X)] | > 






/ _2r7 2+3+"'' \ 
<2exp ^^ +2Pr(B5). 

If we choose $\delta = \varepsilon$ then the first term is at most $\frac{1}{2n^\gamma}$ for any $\gamma > 0$ and sufficiently
large $n$. Applying Lemma 4 with any $\gamma^* > \gamma$ we see that for sufficiently large $n$
we also have $2\Pr(B_\varepsilon) < \frac{1}{2n^\gamma}$. Hence the result follows. □

6 Expected clustering coefficient 

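In this section we finally state and prove our main result.

Theorem 3. For any $\beta > 0$, the expected clustering coefficient of $G^n_{m,\beta}$ is given
by

\[ E[C(G^n_{m,\beta})] = \frac{3 c_1 \log n}{c_2 n} + O(1/n), \]

where

\[ c_1 = m(m-1)\frac{(1+\beta)^2}{\beta^2} + m(m-1)^2\frac{(1+\beta)^3}{\beta^2(2+\beta)} \]

and

\[ c_2 = \frac{2+5\beta}{2\beta} m^2 + \frac{2-\beta}{2\beta} m. \]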
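
To get a feel for the size of the leading term, the constants can be evaluated directly. The sketch below is illustrative and rests on an assumption: it takes $c_1 = m(m-1)(1+\beta)^2/\beta^2 + m(m-1)^2(1+\beta)^3/(\beta^2(2+\beta))$ and $c_2 = \frac{2+5\beta}{2\beta}m^2 + \frac{2-\beta}{2\beta}m$, which is how I read the constants of Propositions 1 and 2. The function names are hypothetical and the $O(1/n)$ correction is ignored.

```python
from math import log


def c1(m, beta):
    """Leading coefficient of the expected number of triangles (times log n)."""
    return (m * (m - 1) * (1 + beta) ** 2 / beta ** 2
            + m * (m - 1) ** 2 * (1 + beta) ** 3 / (beta ** 2 * (2 + beta)))


def c2(m, beta):
    """Leading coefficient of the expected number of adjacent half-edge pairs (times n)."""
    return (2 + 5 * beta) / (2 * beta) * m * m + (2 - beta) / (2 * beta) * m


def predicted_clustering(n, m, beta):
    """Leading term 3 * c1 * log(n) / (c2 * n) of the expected clustering coefficient."""
    return 3 * c1(m, beta) * log(n) / (c2(m, beta) * n)
```

For $m = 2$ and $\beta = 1$ this gives $c_1 = 40/3$ and $c_2 = 15$, so the leading term is $\frac{8}{3} \log n / n$. For $m = 1$ the graph is a tree, there are no triangles, and the leading term vanishes.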



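Proof. Recall that $N = N(G^n_{m,\beta})$, $D = D(G^n_{m,\beta})$ denote respectively the
number of triangles and pairs of adjacent half-edges in $G^n_{m,\beta}$. The expected clustering
coefficient is given by $E[C(G^n_{m,\beta})] = E[3N/D]$.

Choose $\varepsilon$ so that $0 < \varepsilon < \frac{\beta}{2(2+\beta)}$ and let $\eta = \varepsilon + \frac{1}{2} + \frac{1}{2+\beta} < 1$. Let $I$ denote
the interval $[E[D] - n^\eta, E[D] + n^\eta]$. From Proposition 2 we have $E[D] - n^\eta =
c_2 n - (1 + o(1))n^\eta$ and $E[D] + n^\eta = c_2 n + (1 + o(1))n^\eta$. Let $n \ge n^*$, the
minimum value of $n$ such that Theorem 2 may be applied with $\gamma = 4$. Since
$C(G^n_{m,\beta}) \le m$, an upper bound for $E[C(G^n_{m,\beta})]$ may be obtained as follows.

\[ E[C(G^n_{m,\beta})] \le \sum_{j=1}^{\infty} \sum_{i \in I} \frac{3j}{i} \Pr(N = j, D = i) + m \Pr(D \notin I) \le \sum_{j=1}^{\infty} \frac{3j}{c_2 n - (1 + o(1))n^\eta} \Pr(N = j) + m \Pr(D \notin I). \]

Applying Theorem 2 with $\gamma = 1$ and then Proposition 1 we obtain

\[ E[C(G^n_{m,\beta})] \le \sum_{j=1}^{\infty} \frac{3j}{c_2 n - (1 + o(1))n^\eta} \Pr(N = j) + \frac{m}{n} = \frac{3 c_1 \log n}{c_2 n} + O(1/n). \]

A lower bound for $E[C(G^n_{m,\beta})]$ may be obtained as follows.

\[ E[C(G^n_{m,\beta})] \ge \sum_{j=1}^{\infty} \sum_{i \in I} \frac{3j}{i} \Pr(N = j, D = i) \ge \sum_{j=1}^{\infty} \frac{3j}{c_2 n + (1 + o(1))n^\eta} \Pr(N = j, D \in I) \]
\[ = \frac{3E[N]}{c_2 n + (1 + o(1))n^\eta} - \sum_{j=1}^{\infty} \frac{3j}{c_2 n + (1 + o(1))n^\eta} \Pr(N = j, D \notin I). \]

Now since there are at most $n^3 m^3$ triangles in $G^n_{m,\beta}$,

\[ \sum_{j=1}^{\infty} \frac{3j}{c_2 n + (1 + o(1))n^\eta} \Pr(N = j, D \notin I) \le \frac{3 n^3 m^3}{c_2 n + (1 + o(1))n^\eta} \Pr(D \notin I). \]

Applying Theorem 2 with $\gamma = 4$ shows that this is $O(1/n)$. Finally

\[ \frac{3E[N]}{c_2 n + (1 + o(1))n^\eta} = \frac{3 c_1 \log n}{c_2 n} \left(1 - (1/c_2 + o(1))\, n^{\eta - 1}\right) + O(1/n) = \frac{3 c_1 \log n}{c_2 n} + O(1/n). \]

□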

7 Conclusion 

Our main result shows that for $\beta > 0$ the expectation of the clustering coefficient
of the Móri graph is asymptotically proportional to $\log n / n$ and consequently
that the Móri graphs do not have the small-world property. Bollobás and
Riordan showed for an almost identical model that when $\beta = 0$, the expectation
of the clustering coefficient is asymptotically proportional to $(\log n)^2/n$. An
unexpected consequence, for which we do not yet have a good explanation, is
that the clustering coefficient has a discontinuity at $\beta = 0$.